Analyzed Scores and Performances (ASAP)

Curtis Elsasser

Overview

Music is a mystery. Our ears love it and are quick to process it, but our brains struggle to understand it.

  • I am a musician and I love music.
  • I am a budding data scientist and I love data.
  • Music theory is not always satisfying.
  • I want to look for patterns in music that a machine may be better at finding.
  • Different perspectives nurture new discoveries.
  • I have wanted to view music through the lens of a computer for ages. What better time than now?

The Data

The following composers and their works are represented in the catalog.

Scores vs. Performances

Performances and scores are very closely related, but they are not the same. The score is the composition as written by the composer. The performance is the composition as played by the performer. Classical music is a very structured genre, but the performance of it is very expressive. It’s very difficult to reproduce a performance in its entirety with metadata such as dynamics. Where the tempo, key signature and time signature are meaningful in the score, they are meaningless in the performances in this repository. Simply put, the score is the blueprint; the performance is the building.

Schema: the Catalog

Schema: the Score/Performance

MIDI?

MIDI = Musical Instrument Digital Interface.

Type              Performer   Media
Audio             Elaine Lee
MIDI Performance  Elaine Lee
MIDI Score        NA

The Well-Tempered Clavier I No. 3 in C-sharp major (BWV 848) by J.S. Bach

The Question

Did composition note variance increase between 1685 and 1953?

I believe it did. To test this, I will:

  • Divide the data into two portions: 1685 - 1799 and 1799 - 1953
  • Use an F-test to determine if there is a significant difference in note variance between the two portions.
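
The two steps above can be sketched in R roughly as follows. This is a minimal sketch, not the analysis code itself: the data frame `tbl_music`, its `year` column, and the simulated values are assumptions for illustration. Only the `note_normal` column name and the split year 1799 come from these slides. Works dated exactly 1799 are placed in the later portion here, which is an arbitrary choice.

```r
# Hypothetical stand-in for the real note table: one row per note, with the
# composition year and a normalized note value. All values are simulated.
set.seed(1)
tbl_music <- data.frame(
  year        = sample(1685:1953, 5000, replace = TRUE),
  note_normal = rnorm(5000, mean = 0.5, sd = 0.1)
)

# Step 1: divide the data into two portions at 1799.
# (Assumption: works dated exactly 1799 go into the later portion.)
tbl_music_old <- tbl_music[tbl_music$year <  1799, ]
tbl_music_new <- tbl_music[tbl_music$year >= 1799, ]

# Step 2: F-test comparing the two variances (new over old).
result <- var.test(tbl_music_new$note_normal, tbl_music_old$note_normal)
print(result)
```

With simulated data drawn from a single distribution, the test should usually find no difference; the point is only to show the shape of the procedure.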

Independent variable: time.

Dependent variable: note variance.

Composers

Note Variance

Assumptions

  1. Is the data normally distributed?

QQ Plot
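
A QQ plot of this kind can be drawn with base R's `qqnorm` and `qqline`. The vector below is simulated and only stands in for the real note values, so treat this as a sketch under that assumption.

```r
# Simulated placeholder for one period's normalized note values (assumption).
set.seed(1)
note_normal <- rnorm(2000, mean = 0.5, sd = 0.1)

# Plot sample quantiles against theoretical normal quantiles.
qq <- qqnorm(note_normal, main = "Normal Q-Q plot: note_normal")

# Reference line through the first and third quartiles. Points that hug
# this line suggest the data is close to normally distributed.
qqline(note_normal)
```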

Numbers

1685 - 1799

mean SD N
0.5 0.097 250163

1799 - 1953

mean SD N
0.504 0.112 431444

Ratio of SDs: \[ \frac{\text{sd}(1799 - 1953)}{\text{sd}(1685 - 1799)} = \frac{0.112}{0.097} \approx 1.155\]
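
Because variance is the square of the standard deviation, the ratio of variances follows from the same rounded figures. It agrees with the F statistic reported by the F test:

\[ \frac{\text{var}(1799 - 1953)}{\text{var}(1685 - 1799)} = \left(\frac{0.112}{0.097}\right)^2 = \frac{0.012544}{0.009409} \approx 1.333\]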

Assumptions (cont.)

  1. Is the data normally distributed?

Yes, sufficiently for us to proceed.

  2. Independence?

Yes, each note is independent of the other notes.

  3. Homogeneity of variance?

I calculated the SD for both portions of the dataset and found that they are close: 0.097 vs. 0.112, a difference of ~0.015, which is about 1.5% of our range. This is acceptable.

Hypothesis

\(H_0\): There is no significant difference in note variance over time.

\(H_1\): There is a significant difference in note variance over time.

F Test

var.test(tbl_music_new$note_normal, tbl_music_old$note_normal)


    F test to compare two variances

data:  tbl_music_new$note_normal and tbl_music_old$note_normal
F = 1.333, num df = 431443, denom df = 250162, p-value < 2.2e-16
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 1.325578 1.340352
sample estimates:
ratio of variances 
          1.332955 
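
For anyone curious what `var.test` is doing, here is a minimal sketch that reproduces its F statistic, two-sided p-value, and 95% confidence interval by hand using `pf` and `qf`. The helper name `f_test_by_hand` and the simulated vectors are assumptions for illustration and are not part of the original analysis.

```r
# Hypothetical helper: two-sided F test for equality of two variances.
f_test_by_hand <- function(x, y, conf_level = 0.95) {
  f   <- var(x) / var(y)            # ratio of sample variances
  df1 <- length(x) - 1              # numerator degrees of freedom
  df2 <- length(y) - 1              # denominator degrees of freedom

  p_lower <- pf(f, df1, df2)        # P(F <= f) under the null hypothesis
  p_value <- 2 * min(p_lower, 1 - p_lower)

  # Confidence interval for the true ratio of variances.
  beta <- (1 - conf_level) / 2
  ci   <- c(f / qf(1 - beta, df1, df2), f / qf(beta, df1, df2))

  list(f = f, df1 = df1, df2 = df2, p_value = p_value, conf_int = ci)
}

# Simulated placeholders for the two portions (assumption, not real data).
set.seed(1)
music_old <- rnorm(300, mean = 0.5, sd = 0.097)
music_new <- rnorm(400, mean = 0.5, sd = 0.112)

by_hand  <- f_test_by_hand(music_new, music_old)
built_in <- var.test(music_new, music_old)
```

Comparing `by_hand` with `built_in` on the same vectors is a quick way to check one's understanding of the test.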

Conclusion

The p-value is crazy small (< 2.2e-16) and the 95% confidence interval for the ratio of variances (1.325578, 1.340352) sits entirely above 1, which gives me reason to believe that I can reject the null hypothesis. There is a significant difference in note variance between the two portions of the dataset: the variance in the later portion is roughly a third higher than in the earlier one.

Important?

It is important in the same way that history is important. It informs us of who we are and where we might be going. Interestingly, I would guess that the variance in note values has decreased in the past ~70 years. But this does not make us bad people. Variance in music does not correlate to quality. Egads, I wouldn’t want to touch that experiment with a ten-foot pole.

Limitations?

  • Do the samples faithfully represent the period between 1685 and 1953?
    • Not randomly selected. They are the big-shots.
  • Are the works included representative of their composers?
  • My own limitations. I am still struggling somewhat with statistical tests.

Naughty or Nice

The Well-Tempered Clavier I No. 3 in C-sharp major

It is the most impossible key in the whole of the Wohltemperirte Clavier: C-sharp major. No fewer than seven sharps adorn the beginning of each staff. Furthermore, it is an unnecessarily complicated key, as instead of seven sharps you could use five flats to write exactly the same pitch – as D-flat major. In 1728, the music theorist Johann David Heinichen therefore classified C-sharp major as one of the ‘superfluous keys’. Here, Bach is deliberately toying with the mind of the keyboard player, as the instinctive correspondence between the black noteheads on the paper and the fingers on the keys no longer works.

Patrick Ayrton